Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 523890 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 41.5 MiB |
| Average record size in memory | 83.0 B |
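The last two rows of this table are linked by simple arithmetic: the average record size is the total in-memory size divided by the number of observations. A minimal check using only the figures reported above; the helper name `average_record_size` is illustrative and not part of pandas-profiling.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * MIB / n_rows


# Figures taken from the "Dataset statistics" table.
avg_bytes = average_record_size(41.5, 523890)
print(f"about {avg_bytes:.0f} B per record")
```

The result agrees with the reported 83.0 B up to the rounding of the 41.5 MiB total.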
Variable types
| Numeric | 9 |
|---|---|
| Boolean | 3 |
| Categorical | 1 |
| Message | Type |
|---|---|
| pos_r is highly correlated with pos_t | High correlation |
| pos_phi is highly correlated with mom_phi | High correlation |
| pos_t is highly correlated with pos_r | High correlation |
| mom_p is highly correlated with isHiggs and 2 other fields | High correlation |
| mom_phi is highly correlated with pos_phi | High correlation |
| isHiggs is highly correlated with mom_p and 2 other fields | High correlation |
| isZ is highly correlated with mom_p and 2 other fields | High correlation |
| label is highly correlated with mom_p and 2 other fields | High correlation |
| pos_r is highly correlated with pos_theta and 1 other field | High correlation |
| pos_theta is highly correlated with pos_r and 1 other field | High correlation |
| pos_phi is highly correlated with mom_phi | High correlation |
| pos_t is highly correlated with pos_r and 1 other field | High correlation |
| mom_phi is highly correlated with pos_phi | High correlation |
| isHiggs is highly correlated with isZ and 1 other field | High correlation |
| isZ is highly correlated with isHiggs and 1 other field | High correlation |
| label is highly correlated with isHiggs and 1 other field | High correlation |
| pos_r is highly correlated with pos_theta and 1 other field | High correlation |
| pos_theta is highly correlated with pos_r and 1 other field | High correlation |
| pos_phi is highly correlated with mom_phi | High correlation |
| pos_t is highly correlated with pos_r and 1 other field | High correlation |
| mom_phi is highly correlated with pos_phi | High correlation |
| isHiggs is highly correlated with isZ and 1 other field | High correlation |
| isZ is highly correlated with isHiggs and 1 other field | High correlation |
| label is highly correlated with isHiggs and 1 other field | High correlation |
| isOther is highly correlated with label | High correlation |
| mom_mass is highly correlated with pid and 1 other field | High correlation |
| isHiggs is highly correlated with label and 2 other fields | High correlation |
| pid is highly correlated with mom_mass | High correlation |
| pos_t is highly correlated with pos_r | High correlation |
| pos_r is highly correlated with pos_t | High correlation |
| mom_theta is highly correlated with pos_theta | High correlation |
| pos_phi is highly correlated with pos_theta and 1 other field | High correlation |
| pos_theta is highly correlated with mom_mass and 2 other fields | High correlation |
| label is highly correlated with isOther and 3 other fields | High correlation |
| mom_phi is highly correlated with pos_phi | High correlation |
| mom_p is highly correlated with isHiggs and 2 other fields | High correlation |
| isZ is highly correlated with isHiggs and 2 other fields | High correlation |
| isHiggs is highly correlated with isZ and 1 other field | High correlation |
| isOther is highly correlated with label | High correlation |
| isZ is highly correlated with isHiggs and 1 other field | High correlation |
| label is highly correlated with isHiggs and 2 other fields | High correlation |
| mom_p has unique values | Unique |
| mom_theta has unique values | Unique |
| mom_phi has unique values | Unique |
| pos_r has 215111 (41.1%) zeros | Zeros |
| pos_theta has 215111 (41.1%) zeros | Zeros |
| pos_phi has 215111 (41.1%) zeros | Zeros |
| pos_t has 215111 (41.1%) zeros | Zeros |
| mom_mass has 261398 (49.9%) zeros | Zeros |
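The Zeros messages above give the number of entries that are exactly zero and their share of all 523890 observations. A small sketch of that calculation in plain Python; `zero_share` is a hypothetical helper, and the short list is a toy stand-in for a real column, not data from the report.

```python
def zero_share(values):
    """Return (count, percentage) of entries that are exactly zero."""
    values = list(values)
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


# Toy column standing in for something like pos_r (illustrative values only).
count, pct = zero_share([0.0, 0.0, 5.17, 47.6, 0.0])
print(count, f"{pct:.1f}%")

# The reported shares follow from the reported counts in the same way.
pos_share = 100.0 * 215111 / 523890   # pos_r, pos_theta, pos_phi, pos_t
mass_share = 100.0 * 261398 / 523890  # mom_mass
print(f"{pos_share:.1f}% {mass_share:.1f}%")
```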
Reproduction
| Analysis started | 2021-07-04 12:24:16.352271 |
|---|---|
| Analysis finished | 2021-07-04 12:25:16.600422 |
| Duration | 1 minute and 0.25 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
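The reported duration is the difference between the two timestamps in this table. A short check with the standard library, assuming both timestamps are in the same time zone (the report does not say):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2021-07-04 12:24:16.352271", FMT)
finished = datetime.strptime("2021-07-04 12:25:16.600422", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```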
pid
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.32070282 |
| Minimum | -2212 |
| Maximum | 2212 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 136667 |
| Negative (%) | 26.1% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | -2212 |
|---|---|
| 5-th percentile | -211 |
| Q1 | -12 |
| median | 22 |
| Q3 | 130 |
| 95-th percentile | 211 |
| Maximum | 2212 |
| Range | 4424 |
| Interquartile range (IQR) | 142 |
Descriptive statistics
| Standard deviation | 475.4657624 |
|---|---|
| Coefficient of variation (CV) | 35.69374446 |
| Kurtosis | 15.74687602 |
| Mean | 13.32070282 |
| Median Absolute Deviation (MAD) | 34 |
| Skewness | -0.08537914953 |
| Sum | 6978583 |
| Variance | 226067.6912 |
| Monotonicity | Not monotonic |
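Several of the figures in the quantile and descriptive tables above are derived from one another. The sketch below recomputes them from the reported values, as a consistency check rather than a description of how pandas-profiling computes them internally. Note that the coefficient of variation divides by the mean, so it becomes very large in magnitude for columns whose mean is close to zero.

```python
# Reported values for this variable.
n_rows = 523890
std = 475.4657624
mean = 13.32070282
q1, q3 = -12, 130
minimum, maximum = -2212, 2212

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_rows            # sum = mean x number of observations

print(round(cv, 4), round(variance, 2), iqr, value_range, round(total))
```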
Histogram with fixed size bins (bins=20)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 22 | 236924 | 45.2% |
| 211 | 95863 | 18.3% |
| -211 | 95654 | 18.3% |
| -321 | 13840 | 2.6% |
| 321 | 13760 | 2.6% |
| 130 | 13503 | 2.6% |
| 12 | 11105 | 2.1% |
| -12 | 11060 | 2.1% |
| -2212 | 5873 | 1.1% |
| 2212 | 5798 | 1.1% |
| Other values (10) | 20510 | 3.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| -2212 | 5873 | 1.1% |
| -2112 | 5534 | 1.1% |
| -321 | 13840 | 2.6% |
| -211 | 95654 | 18.3% |
| -16 | 125 | < 0.1% |
| -14 | 1079 | 0.2% |
| -13 | 991 | 0.2% |
| -12 | 11060 | 2.1% |
| -11 | 2511 | 0.5% |
| 11 | 2466 | 0.5% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 2212 | 5798 | 1.1% |
| 2112 | 5609 | 1.1% |
| 321 | 13760 | 2.6% |
| 211 | 95863 | 18.3% |
| 130 | 13503 | 2.6% |
| 22 | 236924 | 45.2% |
| 16 | 125 | < 0.1% |
| 14 | 980 | 0.2% |
| 13 | 1090 | 0.2% |
| 12 | 11105 | 2.1% |
pos_r
| Distinct | 147916 |
|---|---|
| Distinct (%) | 28.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.82737822 |
| Minimum | 0 |
| Maximum | 20675.42931 |
| Zeros | 215111 |
| Zeros (%) | 41.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 5.131018041 × 10⁻⁵ |
| Q3 | 0.2494305618 |
| 95-th percentile | 113.9591273 |
| Maximum | 20675.42931 |
| Range | 20675.42931 |
| Interquartile range (IQR) | 0.2494305618 |
Descriptive statistics
| Standard deviation | 251.6969232 |
|---|---|
| Coefficient of variation (CV) | 7.025267707 |
| Kurtosis | 572.0904676 |
| Mean | 35.82737822 |
| Median Absolute Deviation (MAD) | 5.131018041 × 10⁻⁵ |
| Skewness | 17.96524994 |
| Sum | 18769605.18 |
| Variance | 63351.34117 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 3.109283469 | 10 | < 0.1% |
| 0.2541469399 | 10 | < 0.1% |
| 1.412746095 | 9 | < 0.1% |
| 0.6010237942 | 9 | < 0.1% |
| 0.2115099823 | 9 | < 0.1% |
| 0.06376061778 | 8 | < 0.1% |
| 0.0425469599 | 8 | < 0.1% |
| 1.589326681 | 8 | < 0.1% |
| 0.3806201846 | 7 | < 0.1% |
| Other values (147906) | 308701 | 58.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 1.321541899 × 10⁻⁹ | 2 | < 0.1% |
| 3.288187914 × 10⁻⁹ | 2 | < 0.1% |
| 1.127240242 × 10⁻⁸ | 2 | < 0.1% |
| 1.174375335 × 10⁻⁸ | 2 | < 0.1% |
| 1.252954158 × 10⁻⁸ | 2 | < 0.1% |
| 1.388321658 × 10⁻⁸ | 2 | < 0.1% |
| 1.417202206 × 10⁻⁸ | 2 | < 0.1% |
| 1.424466981 × 10⁻⁸ | 2 | < 0.1% |
| 1.919384125 × 10⁻⁸ | 2 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 20675.42931 | 2 | < 0.1% |
| 13743.31145 | 2 | < 0.1% |
| 12255.97393 | 2 | < 0.1% |
| 11738.9314 | 2 | < 0.1% |
| 11052.18253 | 2 | < 0.1% |
| 10680.22521 | 2 | < 0.1% |
| 10315.76274 | 2 | < 0.1% |
| 10256.94449 | 2 | < 0.1% |
| 10256.94359 | 2 | < 0.1% |
| 10160.51661 | 2 | < 0.1% |
pos_theta
| Distinct | 147916 |
|---|---|
| Distinct (%) | 28.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9272984453 |
| Minimum | 0 |
| Maximum | 3.130104787 |
| Zeros | 215111 |
| Zeros (%) | 41.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.8109428138 |
| Q3 | 1.724584543 |
| 95-th percentile | 2.542850852 |
| Maximum | 3.130104787 |
| Range | 3.130104787 |
| Interquartile range (IQR) | 1.724584543 |
Descriptive statistics
| Standard deviation | 0.9324573161 |
|---|---|
| Coefficient of variation (CV) | 1.005563334 |
| Kurtosis | -1.204525554 |
| Mean | 0.9272984453 |
| Median Absolute Deviation (MAD) | 0.8109428138 |
| Skewness | 0.4391705094 |
| Sum | 485802.3825 |
| Variance | 0.8694766464 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 0.8589685805 | 10 | < 0.1% |
| 2.625098116 | 10 | < 0.1% |
| 1.837176593 | 9 | < 0.1% |
| 1.520985848 | 9 | < 0.1% |
| 1.573417931 | 9 | < 0.1% |
| 1.837377081 | 8 | < 0.1% |
| 1.1480704 | 8 | < 0.1% |
| 0.6255795275 | 8 | < 0.1% |
| 2.444067965 | 7 | < 0.1% |
| Other values (147906) | 308701 | 58.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 0.01565781315 | 2 | < 0.1% |
| 0.01566686588 | 3 | < 0.1% |
| 0.01628507316 | 2 | < 0.1% |
| 0.01711867925 | 2 | < 0.1% |
| 0.01787172869 | 2 | < 0.1% |
| 0.01938913531 | 2 | < 0.1% |
| 0.02064406145 | 1 | < 0.1% |
| 0.02175498977 | 2 | < 0.1% |
| 0.02178225576 | 2 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.130104787 | 2 | < 0.1% |
| 3.128635859 | 2 | < 0.1% |
| 3.128635846 | 2 | < 0.1% |
| 3.128091755 | 3 | < 0.1% |
| 3.125393951 | 2 | < 0.1% |
| 3.123746626 | 2 | < 0.1% |
| 3.12224134 | 2 | < 0.1% |
| 3.121973987 | 2 | < 0.1% |
| 3.121037687 | 2 | < 0.1% |
| 3.12102188 | 2 | < 0.1% |
pos_phi
| Distinct | 147864 |
|---|---|
| Distinct (%) | 28.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.003248919521 |
| Minimum | -3.141589072 |
| Maximum | 3.141488565 |
| Zeros | 215111 |
| Zeros (%) | 41.1% |
| Negative | 154953 |
| Negative (%) | 29.6% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | -3.141589072 |
|---|---|
| 5-th percentile | -2.609384477 |
| Q1 | -0.4842947578 |
| median | 0 |
| Q3 | 0.4649110527 |
| 95-th percentile | 2.607050637 |
| Maximum | 3.141488565 |
| Range | 6.283077636 |
| Interquartile range (IQR) | 0.9492058105 |
Descriptive statistics
| Standard deviation | 1.392739216 |
|---|---|
| Coefficient of variation (CV) | -428.6776594 |
| Kurtosis | 0.04902840625 |
| Mean | -0.003248919521 |
| Median Absolute Deviation (MAD) | 0.4740099868 |
| Skewness | -0.0001752078931 |
| Sum | -1702.076448 |
| Variance | 1.939722524 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| -1.481148159 | 10 | < 0.1% |
| 0.925401528 | 10 | < 0.1% |
| -2.178383369 | 9 | < 0.1% |
| -0.7076902522 | 9 | < 0.1% |
| -0.6443454288 | 9 | < 0.1% |
| -1.968330542 | 8 | < 0.1% |
| -1.167121913 | 8 | < 0.1% |
| 1.339836338 | 8 | < 0.1% |
| -1.501756522 | 7 | < 0.1% |
| Other values (147854) | 308701 | 58.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| -3.141589072 | 2 | < 0.1% |
| -3.141333637 | 2 | < 0.1% |
| -3.141330808 | 3 | < 0.1% |
| -3.141218116 | 2 | < 0.1% |
| -3.141213569 | 2 | < 0.1% |
| -3.141151012 | 2 | < 0.1% |
| -3.14113539 | 2 | < 0.1% |
| -3.141133051 | 2 | < 0.1% |
| -3.141110815 | 2 | < 0.1% |
| -3.140993957 | 2 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.141488565 | 2 | < 0.1% |
| 3.141430697 | 2 | < 0.1% |
| 3.141361984 | 2 | < 0.1% |
| 3.141328549 | 2 | < 0.1% |
| 3.141316367 | 1 | < 0.1% |
| 3.14130782 | 2 | < 0.1% |
| 3.141238566 | 2 | < 0.1% |
| 3.141180817 | 2 | < 0.1% |
| 3.141154842 | 2 | < 0.1% |
| 3.141154481 | 2 | < 0.1% |
pos_t
| Distinct | 147669 |
|---|---|
| Distinct (%) | 28.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.46604689 |
| Minimum | 0 |
| Maximum | 20682.55744 |
| Zeros | 215111 |
| Zeros (%) | 41.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 5.469365928 × 10⁻⁵ |
| Q3 | 0.2592700083 |
| 95-th percentile | 122.7358632 |
| Maximum | 20682.55744 |
| Range | 20682.55744 |
| Interquartile range (IQR) | 0.2592700083 |
Descriptive statistics
| Standard deviation | 252.4496483 |
|---|---|
| Coefficient of variation (CV) | 6.922868525 |
| Kurtosis | 566.4122854 |
| Mean | 36.46604689 |
| Median Absolute Deviation (MAD) | 5.469365928 × 10⁻⁵ |
| Skewness | 17.8508682 |
| Sum | 19104197.31 |
| Variance | 63730.8249 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 0.2735360133 | 10 | < 0.1% |
| 3.208061146 | 10 | < 0.1% |
| 0.6883983004 | 9 | < 0.1% |
| 1.475074245 | 9 | < 0.1% |
| 0.6379185762 | 9 | < 0.1% |
| 1.817783237 | 8 | < 0.1% |
| 0.07353922287 | 8 | < 0.1% |
| 0.04902620546 | 8 | < 0.1% |
| 1.934518745 | 7 | < 0.1% |
| Other values (147659) | 308701 | 58.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 215111 | 41.1% |
| 1.999456332 × 10⁻⁹ | 2 | < 0.1% |
| 5.437000528 × 10⁻⁹ | 2 | < 0.1% |
| 1.137207877 × 10⁻⁸ | 2 | < 0.1% |
| 1.266879093 × 10⁻⁸ | 2 | < 0.1% |
| 1.536747415 × 10⁻⁸ | 2 | < 0.1% |
| 1.594279841 × 10⁻⁸ | 2 | < 0.1% |
| 1.697779313 × 10⁻⁸ | 2 | < 0.1% |
| 1.945356913 × 10⁻⁸ | 2 | < 0.1% |
| 2.007894421 × 10⁻⁸ | 2 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 20682.55744 | 2 | < 0.1% |
| 13747.79149 | 2 | < 0.1% |
| 12270.57156 | 2 | < 0.1% |
| 11741.16556 | 2 | < 0.1% |
| 11054.72039 | 2 | < 0.1% |
| 10687.00584 | 2 | < 0.1% |
| 10316.4145 | 2 | < 0.1% |
| 10258.29859 | 4 | < 0.1% |
| 10173.44392 | 2 | < 0.1% |
| 10088.39541 | 2 | < 0.1% |
mom_p
| Distinct | 523890 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.5635358 |
| Minimum | 0.0004017200743 |
| Maximum | 78.1807838 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0.0004017200743 |
|---|---|
| 5-th percentile | 0.09219266417 |
| Q1 | 0.4276857475 |
| median | 1.194596419 |
| Q3 | 3.442734899 |
| 95-th percentile | 19.70707746 |
| Maximum | 78.1807838 |
| Range | 78.18038208 |
| Interquartile range (IQR) | 3.015049151 |
Descriptive statistics
| Standard deviation | 10.79091911 |
|---|---|
| Coefficient of variation (CV) | 2.364596135 |
| Kurtosis | 20.95740229 |
| Mean | 4.5635358 |
| Median Absolute Deviation (MAD) | 0.9549745514 |
| Skewness | 4.422035627 |
| Sum | 2390790.77 |
| Variance | 116.4439353 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.6379559984 | 1 | < 0.1% |
| 0.9802428343 | 1 | < 0.1% |
| 1.741703104 | 1 | < 0.1% |
| 1.134568181 | 1 | < 0.1% |
| 0.2665870765 | 1 | < 0.1% |
| 0.7335075248 | 1 | < 0.1% |
| 2.782271205 | 1 | < 0.1% |
| 1.978348171 | 1 | < 0.1% |
| 0.5031865176 | 1 | < 0.1% |
| 36.80955825 | 1 | < 0.1% |
| Other values (523880) | 523880 | > 99.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.0004017200743 | 1 | < 0.1% |
| 0.0006314008636 | 1 | < 0.1% |
| 0.0007112982751 | 1 | < 0.1% |
| 0.0008050747499 | 1 | < 0.1% |
| 0.0009187268328 | 1 | < 0.1% |
| 0.001109436925 | 1 | < 0.1% |
| 0.001260449352 | 1 | < 0.1% |
| 0.001468048982 | 1 | < 0.1% |
| 0.001472858267 | 1 | < 0.1% |
| 0.001479125851 | 1 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 78.1807838 | 1 | < 0.1% |
| 78.18077104 | 1 | < 0.1% |
| 78.17726023 | 1 | < 0.1% |
| 78.17396114 | 1 | < 0.1% |
| 78.17285036 | 1 | < 0.1% |
| 78.17158499 | 1 | < 0.1% |
| 78.17071626 | 1 | < 0.1% |
| 78.16984971 | 1 | < 0.1% |
| 78.16739028 | 1 | < 0.1% |
| 78.1663704 | 1 | < 0.1% |
mom_theta
| Distinct | 523890 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.571314593 |
| Minimum | 0.003447101248 |
| Maximum | 3.139177111 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0.003447101248 |
|---|---|
| 5-th percentile | 0.463304147 |
| Q1 | 1.052298399 |
| median | 1.571644949 |
| Q3 | 2.088804202 |
| 95-th percentile | 2.68383875 |
| Maximum | 3.139177111 |
| Range | 3.13573001 |
| Interquartile range (IQR) | 1.036505803 |
Descriptive statistics
| Standard deviation | 0.6784044163 |
|---|---|
| Coefficient of variation (CV) | 0.4317432165 |
| Kurtosis | -0.7963667357 |
| Mean | 1.571314593 |
| Median Absolute Deviation (MAD) | 0.5182879822 |
| Skewness | 0.003163044846 |
| Sum | 823196.002 |
| Variance | 0.4602325521 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.121827361 | 1 | < 0.1% |
| 1.127709356 | 1 | < 0.1% |
| 1.894106388 | 1 | < 0.1% |
| 1.645172239 | 1 | < 0.1% |
| 0.5123234469 | 1 | < 0.1% |
| 2.922814051 | 1 | < 0.1% |
| 2.127515895 | 1 | < 0.1% |
| 0.219976207 | 1 | < 0.1% |
| 1.918943361 | 1 | < 0.1% |
| 2.365234006 | 1 | < 0.1% |
| Other values (523880) | 523880 | > 99.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.003447101248 | 1 | < 0.1% |
| 0.003469401599 | 1 | < 0.1% |
| 0.005321450579 | 1 | < 0.1% |
| 0.005645090662 | 1 | < 0.1% |
| 0.0063671429 | 1 | < 0.1% |
| 0.006408109402 | 1 | < 0.1% |
| 0.006683624021 | 1 | < 0.1% |
| 0.007229995062 | 1 | < 0.1% |
| 0.007351265922 | 1 | < 0.1% |
| 0.008955810456 | 1 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.139177111 | 1 | < 0.1% |
| 3.13865275 | 1 | < 0.1% |
| 3.137510503 | 1 | < 0.1% |
| 3.137246888 | 1 | < 0.1% |
| 3.136847471 | 1 | < 0.1% |
| 3.135266629 | 1 | < 0.1% |
| 3.135097951 | 1 | < 0.1% |
| 3.13384206 | 1 | < 0.1% |
| 3.133476024 | 1 | < 0.1% |
| 3.132516798 | 1 | < 0.1% |
mom_phi
| Distinct | 523890 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.001858482681 |
| Minimum | -3.141583493 |
| Maximum | 3.141590759 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 262524 |
| Negative (%) | 50.1% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | -3.141583493 |
|---|---|
| 5-th percentile | -2.823393021 |
| Q1 | -1.574115916 |
| median | -0.006785427 |
| Q3 | 1.574189431 |
| 95-th percentile | 2.823762666 |
| Maximum | 3.141590759 |
| Range | 6.283174253 |
| Interquartile range (IQR) | 3.148305347 |
Descriptive statistics
| Standard deviation | 1.813322318 |
|---|---|
| Coefficient of variation (CV) | -975.7004123 |
| Kurtosis | -1.203226292 |
| Mean | -0.001858482681 |
| Median Absolute Deviation (MAD) | 1.573982566 |
| Skewness | 0.003482621188 |
| Sum | -973.6404918 |
| Variance | 3.28813783 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -0.3964367327 | 1 | < 0.1% |
| 2.297844709 | 1 | < 0.1% |
| 1.572672958 | 1 | < 0.1% |
| -3.059062467 | 1 | < 0.1% |
| -2.103024796 | 1 | < 0.1% |
| -2.678470412 | 1 | < 0.1% |
| 1.682209852 | 1 | < 0.1% |
| -2.060800928 | 1 | < 0.1% |
| -1.194083062 | 1 | < 0.1% |
| 0.9200304069 | 1 | < 0.1% |
| Other values (523880) | 523880 | > 99.9% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| -3.141583493 | 1 | < 0.1% |
| -3.141580634 | 1 | < 0.1% |
| -3.1415638 | 1 | < 0.1% |
| -3.141559946 | 1 | < 0.1% |
| -3.141544847 | 1 | < 0.1% |
| -3.141542478 | 1 | < 0.1% |
| -3.141529285 | 1 | < 0.1% |
| -3.141526393 | 1 | < 0.1% |
| -3.141509263 | 1 | < 0.1% |
| -3.14150156 | 1 | < 0.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.141590759 | 1 | < 0.1% |
| 3.141555111 | 1 | < 0.1% |
| 3.141547009 | 1 | < 0.1% |
| 3.141539749 | 1 | < 0.1% |
| 3.141537017 | 1 | < 0.1% |
| 3.141535679 | 1 | < 0.1% |
| 3.141508933 | 1 | < 0.1% |
| 3.141503416 | 1 | < 0.1% |
| 3.14150236 | 1 | < 0.1% |
| 3.141486297 | 1 | < 0.1% |
mom_mass
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1311676541 |
| Minimum | 0 |
| Maximum | 0.9395700097 |
| Zeros | 261398 |
| Zeros (%) | 49.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 4.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.0005109999911 |
| Q3 | 0.1395699978 |
| 95-th percentile | 0.4976100028 |
| Maximum | 0.9395700097 |
| Range | 0.9395700097 |
| Interquartile range (IQR) | 0.1395699978 |
Descriptive statistics
| Standard deviation | 0.21810892 |
|---|---|
| Coefficient of variation (CV) | 1.662825499 |
| Kurtosis | 5.865233072 |
| Mean | 0.1311676541 |
| Median Absolute Deviation (MAD) | 0.0005109999911 |
| Skewness | 2.465418255 |
| Sum | 68717.42232 |
| Variance | 0.04757150097 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 261398 | 49.9% |
| 0.1395699978 | 191517 | 36.6% |
| 0.4936800003 | 27600 | 5.3% |
| 0.4976100028 | 13503 | 2.6% |
| 0.9382699728 | 11671 | 2.2% |
| 0.9395700097 | 11143 | 2.1% |
| 0.0005109999911 | 4977 | 1.0% |
| 0.105659999 | 2081 | 0.4% |
Extreme values (smallest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 261398 | 49.9% |
| 0.0005109999911 | 4977 | 1.0% |
| 0.105659999 | 2081 | 0.4% |
| 0.1395699978 | 191517 | 36.6% |
| 0.4936800003 | 27600 | 5.3% |
| 0.4976100028 | 13503 | 2.6% |
| 0.9382699728 | 11671 | 2.2% |
| 0.9395700097 | 11143 | 2.1% |
Extreme values (largest)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9395700097 | 11143 | 2.1% |
| 0.9382699728 | 11671 | 2.2% |
| 0.4976100028 | 13503 | 2.6% |
| 0.4936800003 | 27600 | 5.3% |
| 0.1395699978 | 191517 | 36.6% |
| 0.105659999 | 2081 | 0.4% |
| 0.0005109999911 | 4977 | 1.0% |
| 0 | 261398 | 49.9% |
isHiggs
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 511.7 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| True | 503735 | 96.2% |
| False | 20155 | 3.8% |
isZ
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 511.7 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| False | 503890 | 96.2% |
| True | 20000 | 3.8% |
isOther
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 511.7 KiB |
| Value | Count | Frequency (%) |
|---|---|---|
| False | 523735 | > 99.9% |
| True | 155 | < 0.1% |
label
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 29.0 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 523890 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 503735 | 96.2% |
| 1 | 20000 | 3.8% |
| 2 | 155 | < 0.1% |
Length
Histogram of lengths of the category
Pie chart
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 503735 | 96.2% |
| 1 | 20000 | 3.8% |
| 2 | 155 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 523890 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 503735 | 96.2% |
| 1 | 20000 | 3.8% |
| 2 | 155 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 523890 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 503735 | 96.2% |
| 1 | 20000 | 3.8% |
| 2 | 155 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 523890 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 503735 | 96.2% |
| 1 | 20000 | 3.8% |
| 2 | 155 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
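The calculation described above (covariance divided by the product of the standard deviations) can be sketched in plain Python. This is an illustration of the definition, not the code pandas-profiling runs; the function name and the toy data are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of the covariance and the variances cancel, so sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_up = pearson_r(x, [2 * v + 1 for v in x])       # increasing linear relation
r_down = pearson_r(x, [-0.5 * v + 3 for v in x])  # decreasing linear relation
print(r_up, r_down)
```

The two toy relations have different slopes and offsets yet give values of magnitude 1, which illustrates the invariance under changes in location and scale.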
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
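As a sketch of that definition, ρ can be computed by ranking both variables and applying Pearson's formula to the ranks. The helper names are illustrative, and ties are given the average of their positions, which is a common convention assumed here rather than something the report specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r, repeated here so the example is self-contained."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]  # monotonic but not linear
print(spearman_rho(x, y), pearson_r(x, y))
```

On this toy data ρ reaches 1 while r stays below 1, which is the difference between monotonic and linear correlation described above.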
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
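The pair-counting definition above corresponds to the variant often called tau-a. A minimal sketch follows; the function name is illustrative, tied pairs simply count as neither concordant nor discordant, and statistical libraries frequently use a tie-adjusted variant (tau-b) instead, so results on data with ties may differ.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; O(n^2), no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
tau_same = kendall_tau_a(x, [10, 20, 30, 40])  # every pair concordant
tau_swap = kendall_tau_a(x, [1, 3, 2, 4])      # one discordant pair out of six
print(tau_same, tau_swap)
```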
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
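Returning to the bias-corrected Cramér's V described in this section: the sketch below follows the commonly cited form of Bergsma's 2013 correction, computed from a contingency table of counts. It is an illustration under that assumption, not a copy of the report's implementation; the function name is hypothetical, and every row and column of the table is assumed to contain at least one observation.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic from observed versus expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # categories coincide
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # no association
print(v_perfect, v_independent)
```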
First rows
| | pid | pos_r | pos_theta | pos_phi | pos_t | mom_p | mom_theta | mom_phi | mom_mass | isHiggs | isZ | isOther | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 12 | 0.0 | 0.0 | 0.0 | 0.0 | 57.517587 | 2.785539 | -2.578892 | 0.00000 | False | True | False | 1 |
| 1 | -12 | 0.0 | 0.0 | 0.0 | 0.0 | 47.253770 | 0.986708 | -0.963875 | 0.00000 | False | True | False | 1 |
| 2 | 2112 | 0.0 | 0.0 | 0.0 | 0.0 | 3.378490 | 2.085558 | 1.405320 | 0.93957 | True | False | False | 0 |
| 3 | -2212 | 0.0 | 0.0 | 0.0 | 0.0 | 2.899987 | 1.983336 | 1.315393 | 0.93827 | True | False | False | 0 |
| 4 | 321 | 0.0 | 0.0 | 0.0 | 0.0 | 3.256114 | 1.777417 | 1.411150 | 0.49368 | True | False | False | 0 |
| 5 | -211 | 0.0 | 0.0 | 0.0 | 0.0 | 2.257151 | 0.843947 | 1.467199 | 0.13957 | True | False | False | 0 |
| 6 | -211 | 0.0 | 0.0 | 0.0 | 0.0 | 0.589905 | 0.908871 | 2.463460 | 0.13957 | True | False | False | 0 |
| 7 | -211 | 0.0 | 0.0 | 0.0 | 0.0 | 2.088260 | 0.700026 | -2.404715 | 0.13957 | True | False | False | 0 |
| 8 | -211 | 0.0 | 0.0 | 0.0 | 0.0 | 1.136461 | 0.589039 | -1.753818 | 0.13957 | True | False | False | 0 |
| 9 | 211 | 0.0 | 0.0 | 0.0 | 0.0 | 0.800887 | 0.890430 | -1.411619 | 0.13957 | True | False | False | 0 |
Last rows
| | pid | pos_r | pos_theta | pos_phi | pos_t | mom_p | mom_theta | mom_phi | mom_mass | isHiggs | isZ | isOther | label |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 523880 | 22 | 0.000266 | 0.803973 | 0.389445 | 0.000266 | 0.244316 | 0.942980 | 0.356160 | 0.00000 | True | False | False | 0 |
| 523881 | 22 | 5.174263 | 0.799260 | 0.388213 | 5.182875 | 0.315674 | 1.141822 | 0.527419 | 0.00000 | True | False | False | 0 |
| 523882 | 321 | 5.174263 | 0.799260 | 0.388213 | 5.182875 | 7.409359 | 0.789252 | 0.352566 | 0.49368 | True | False | False | 0 |
| 523883 | -211 | 5.174263 | 0.799260 | 0.388213 | 5.182875 | 8.702533 | 0.811139 | 0.431911 | 0.13957 | True | False | False | 0 |
| 523884 | 22 | 47.644578 | 1.553182 | -1.714619 | 55.214441 | 0.064417 | 0.950241 | 1.743621 | 0.00000 | True | False | False | 0 |
| 523885 | 22 | 47.644578 | 1.553182 | -1.714619 | 55.214441 | 0.092689 | 1.407499 | -2.085675 | 0.00000 | True | False | False | 0 |
| 523886 | 22 | 47.644611 | 1.553182 | -1.714619 | 55.214445 | 0.621426 | 1.619202 | -1.747628 | 0.00000 | True | False | False | 0 |
| 523887 | 22 | 47.644611 | 1.553182 | -1.714619 | 55.214445 | 0.206108 | 1.607157 | -1.368038 | 0.00000 | True | False | False | 0 |
| 523888 | 211 | 5.174263 | 0.799260 | 0.388213 | 5.182875 | 13.300955 | 0.783574 | 0.373155 | 0.13957 | True | False | False | 0 |
| 523889 | -211 | 5.174263 | 0.799260 | 0.388213 | 5.182875 | 2.612154 | 0.830888 | 0.395118 | 0.13957 | True | False | False | 0 |
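In the sample rows, label is 1 where isZ is True and 0 where isHiggs is True, and the True counts of the three flags (503735, 20000, 155) equal the counts of labels 0, 1 and 2. This suggests, although the report does not state it, that label encodes which of the three flags is set. A sketch of that assumed encoding; `label_from_flags` is a hypothetical helper.

```python
def label_from_flags(is_higgs: bool, is_z: bool, is_other: bool) -> int:
    """Assumed encoding: 0 = isHiggs, 1 = isZ, 2 = isOther (exactly one flag set)."""
    flags = (is_higgs, is_z, is_other)
    if sum(flags) != 1:
        raise ValueError("expected exactly one flag to be set")
    return flags.index(True)


# First sample row: isHiggs=False, isZ=True, isOther=False, label=1.
print(label_from_flags(False, True, False))

# The three flag counts partition the dataset, as the assumption requires.
print(503735 + 20000 + 155 == 523890)
```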